Abstract: DNA microarray experiments, a well-established experimental 
technique, aim at understanding the function of genes in some biological pro- 
cesses. One of the most common experiments in functional genomics research 
is to compare two groups of microarray data to determine which genes are 
differentially expressed. In this paper, we propose a methodology to estimate 
the proportion of differentially expressed genes in such experiments. We study 
the performance of our method in a simulation study where we compare it to 
other standard methods. Finally we compare the methods in real data from 
two toxicology experiments with mice. 



1. Introduction 

The human genome and a number of other genomes have been almost fully se- 
quenced, but the functions of most genes are still unknown. The difficulty is that 
gene expression is only one of the pieces of cellular processes sometimes called bio- 
logical pathways or networks, and it is not yet possible to observe these pathways 
directly. DNA microarray technology has made it possible to quantify and compare 
relative gene expression profiles across a series of conditions many thousands of 
genes at a time. Identifying groups of genes that are simultaneously expressed
expedites the guesswork of reconstructing biological pathways. The information
collected through the years on genes that participate in biological pathways or
networks has been used to construct GO (Gene Ontology Consortium). The
information on differentially expressed genes from a microarray experiment is
contrasted with the groupings that are known from the existing GO, and a
determination is made on whether or not a certain cellular process is taking
place. In addition, there might be a few genes that are differentially expressed
in the experiment but were not known to be part of the biological process. These
genes become candidates for further extending the pathway; they are confirmed by
further experimentation and by searching for annotations that describe their
function in other processes.

However, determining accurately which genes are biologically differentially
expressed is a nontrivial issue. Microarray experiments are high throughput in
the sense that they evaluate the expression levels of thousands of genes at a
time, but with little replication: the number of replicate chips (biological or
technical) is often only 3 to 5 per condition. In addition, the distributions of
gene expressions across samples tend to be skewed and/or heavy tailed, and hence
they do not follow a normal distribution. In this situation, permutation tests
and traditional t-tests do not work very well because they have very low power.
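
The low power of the permutation test with so few replicates can be seen by direct enumeration. The following is a minimal sketch in Python, not taken from the paper; the function name `permutation_p_value` and the toy intensities are illustrative assumptions. With 4 + 4 samples there are only 70 ways to split the pooled values into two groups of four, and a two-sided statistic counts each split together with its mirror image, so the p-value cannot fall below 2/70.

```python
from itertools import combinations


def permutation_p_value(group1, group2):
    """Exact two-sided permutation p-value for the absolute mean difference."""
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    total = sum(pooled)
    observed = abs(sum(group2) / n2 - sum(group1) / n1)
    as_extreme = 0
    n_splits = 0
    # Enumerate every way of assigning n1 of the pooled values to group 1.
    for first in combinations(range(n1 + n2), n1):
        sum1 = sum(pooled[i] for i in first)
        diff = abs((total - sum1) / n2 - sum1 / n1)
        n_splits += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            as_extreme += 1
    return as_extreme / n_splits


if __name__ == "__main__":
    # Toy intensities (made up): the two groups are completely separated.
    control = [1.0, 2.0, 3.0, 4.0]
    treated = [11.0, 12.0, 13.0, 14.0]
    print(permutation_p_value(control, treated))
```

Since 2/70 is about 0.029 and the next attainable value, 4/70, already exceeds 0.05, only the most extreme arrangements of the eight values can be declared significant at the 0.05 level.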

One way to improve the power of the test is to incorporate the GO information
into the process. Fisher's exact test (Fisher) has been proposed as a way to
detect whether a particular subgroup of genes as a whole is differentially
expressed. The test is applied to the two-way table that crosses the indicator
of significance of the individual gene with the indicator of membership in the
group. Another approach is to compute the Mean-Log-P statistic,
mean(-log(p-value)), over the genes in the group (Pavlidis et al. [13] and
Raghavan et al. [14]) and to compare it with the distribution of the same
statistic under a random subset of genes.
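
As an illustration of the Mean-Log-P idea, here is a minimal Python sketch; it is not the implementation of Pavlidis et al. or Raghavan et al. The names `mean_log_p` and `mean_log_p_test`, the number of random subsets, and the simulated p-values are all assumptions made for the example. The group score is compared with the scores of random gene subsets of the same size.

```python
import math
import random


def mean_log_p(p_values):
    """Mean of -log(p) over a collection of strictly positive p-values."""
    return sum(-math.log(p) for p in p_values) / len(p_values)


def mean_log_p_test(p_values, group, n_random=2000, seed=0):
    """Compare a gene group's Mean-Log-P score with random subsets of equal size.

    p_values: dict mapping gene id -> p-value.
    group:    list of gene ids forming the subgroup of interest.
    Returns (observed score, Monte Carlo p-value for the group).
    """
    rng = random.Random(seed)
    genes = list(p_values)
    observed = mean_log_p([p_values[g] for g in group])
    hits = 0
    for _ in range(n_random):
        subset = rng.sample(genes, len(group))
        if mean_log_p([p_values[g] for g in subset]) >= observed:
            hits += 1
    # The add-one correction keeps the Monte Carlo p-value strictly positive.
    return observed, (hits + 1) / (n_random + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical data: 500 genes with uniform p-values, except that the
    # first 20 genes (the group of interest) have p-values pushed towards zero.
    pvals = {g: rng.uniform(1e-6, 1.0) for g in range(500)}
    for g in range(20):
        pvals[g] = rng.uniform(1e-6, 0.05)
    score, p_group = mean_log_p_test(pvals, list(range(20)))
    print(round(score, 2), p_group)
```

Note that the random-subset reference distribution is drawn from all genes, so when a large share of the genes is differentially expressed the random subsets also score highly, which is the loss of power discussed in the text.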

On the other hand, if the overall number of differentially expressed genes in
the real data is large, then Fisher's exact test and the Mean-Log-P test still
have low power. In order to overcome these problems we propose a new modeling
approach, which consists of the following steps:

1. Estimate the proportion of differentially expressed genes. 

2. Estimate the distribution of p-values for genes that are not differentially ex- 
pressed. One would expect that this distribution is uniform but this is not the 
case in many examples that we have studied. The reason might be related to 
the processing of the data and the discarding of genes that take place at some 
stages of the process. Therefore the model has to estimate the distributions 
of null p-values by a semi-parametric or nonparametric method. 

3. Estimate the distribution of p-values corresponding to differentially expressed 
genes. 

4. Proceed by modeling the distribution of the Mean-Log-P statistic for genes 
that belong to a subgroup or network (see Raghavan et al. [14]), using the 
estimators of steps 1-3. 

In this paper we concentrate on step 1 of the procedure, which corresponds to
the estimation of $\pi$. The quantity $\pi$ is also important in other
situations, for example to calculate q-values (moderated p-values) as proposed
by Storey and Tibshirani [16]. Steps 2-4 of the procedure will be published
elsewhere. In Section 2 we propose a method and an algorithm for estimating
$\pi$. In Section 3 we report the results of an extensive simulation that
supports the performance of our method and compares it with other, simpler
methods.

Example mice and mice2: To illustrate the estimation of $\pi$, we apply our
procedure to the mouse data sets from toxicology experiments (Amaratunga and
Cabrera). These data sets correspond to typical toxicology experiments in which
a group of mice is treated with a toxic compound, and the objective is to find
genes that are differentially expressed relative to samples from untreated mice.

mice and mice2 are two such data sets. Each consists of $n_1 = n_2 = 4$ mice in
the control and treatment groups, and the total number of genes is $G = 4077$
for mice and $G = 3434$ for mice2. They represent two examples of cDNA chips:
the first one, mice, has a high proportion $\pi$ of differentially expressed
genes, whereas mice2 has a much smaller $\pi$.

The data from such experiments consist of suitably normalized intensities
$X_{gij}$, where $g$ ($g = 1, \ldots, G$) indexes the genes on the microarray,
$i$ ($i = 1, 2$) indexes the groups, and $j$ ($j = 1, \ldots, n_i$) indexes the
$j$-th mouse in the $i$-th group. Our goal is to characterize T, the subset of
genes, among the $G$ genes in the experiment, that are differentially expressed
across the two groups.

To determine T, researchers (e.g., Schena et al.) first used fold change, which
does not take variability into account. Subsequent improvements were t-test
statistics (Efron et al., Tusher et al. [17], and Broberg), median-based methods
(Amaratunga and Cabrera [1]) and Bayes and Empirical Bayes procedures (Lee et
al. [10], Baldi and Long, Efron et al., Newton et al. [12], and Lonnstedt and
Speed).

T-tests are the most widely used method for assessing differential expression.
The t-test assumes that the normalized intensities are approximately normally
distributed with the same variance across the groups, i.e.,
$X_{gij} \sim N(\mu_{gi}, \sigma_g^2)$. For each gene $g$, a t-statistic is
calculated in order to test the null hypothesis $\mu_{g1} = \mu_{g2}$, and a
p-value is generated. For small samples the t-test might be replaced by SAM or
by the conditional t-test, Ct (Amaratunga and Cabrera), in order to improve the
power. Here we follow the model proposed by Amaratunga and Cabrera for the Ct
method. Instead of trying to determine which genes are differentially
expressed, we estimate the proportion of differentially expressed genes. As a
consequence we could also produce an ordered list of genes that would be of
interest to the biologist but, as stated above, the entire procedure will be
published elsewhere.
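
To make the contrast between fold change and the t-statistic concrete, here is a small Python sketch of the equal-variance two-sample t-statistic for a single gene. The function name `pooled_t` and the two toy genes are assumptions made for the example; the critical value 2.447 is the two-sided 5% point of Student's t with 6 degrees of freedom, which is what 4 + 4 samples give.

```python
import math
from statistics import mean, variance


def pooled_t(group1, group2):
    """Two-sample t-statistic with pooled variance (equal-variance assumption)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group2) - mean(group1)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


if __name__ == "__main__":
    T_CRIT_DF6 = 2.447  # two-sided 5% critical value, 6 degrees of freedom
    # Toy gene A: large mean difference (2.0) but noisy replicates.
    gene_a = ([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0])
    # Toy gene B: small mean difference (0.5) but very tight replicates.
    gene_b = ([2.4, 2.5, 2.5, 2.6], [2.9, 3.0, 3.0, 3.1])
    for name, (control, treated) in (("A", gene_a), ("B", gene_b)):
        diff = mean(treated) - mean(control)
        t = pooled_t(control, treated)
        print(name, round(diff, 2), round(t, 2), abs(t) > T_CRIT_DF6)
```

In this toy example, ranking by mean difference alone would put gene A first, whereas its t-statistic (about 2.19) stays below the critical value and the t-statistic of gene B is far above it.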



2. Statistical model and inference 

The data for experiments typically consist of suitable iid normalized
intensities:

(2.1) $X_{gij} = \mu_g + \tau_{gi} + \sigma_g \epsilon_{gij}$,

where $\mu_g$ and $\sigma_g$, $g = 1, \ldots, G$, are the effect and variance of
the g-th gene respectively, $\tau_{gi}$ is the effect of the g-th gene in the
i-th group ($i = 1, 2$), and $j$ ($j = 1, \ldots, n_i$) indexes the samples.
This is the same model as in Amaratunga and Cabrera. The treatment effect of the
g-th gene is:

$\tau_g = |\tau_{g2} - \tau_{g1}|$.

We assume that the $\epsilon_{gij}$ are iid observations from an unknown
distribution $F_\epsilon$, and that the $\sigma_g$ and $\tau_g$ are iid
observations from unknown distributions $F_\sigma$ and $F_\tau$, respectively.
$F_\sigma$ represents the distribution of the gene variances. $F_\tau$ is likely
to have a mass at zero with probability $\pi$, representing the proportion of
genes that are not differentially expressed. If the sample sizes were larger,
the unknown distributions could be readily estimated by their respective
empirical cdfs, but for small sample sizes the empirical cdfs would produce very
biased estimators. In the remainder of this section we provide three procedures
to estimate the three distributions $F_\epsilon$, $F_\tau$, and $F_\sigma$,
which try to overcome the biases induced by small sample size.
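
The model can be made concrete with a short simulation. The sketch below is an illustration under stated assumptions, not the authors' code: the function name `simulate_intensities`, the baseline distribution of $\mu_g$, the choice $\sigma_g^2 \sim \chi^2_3/3$, standard normal errors, and a two-point $F_\tau$ (mass at zero plus a single effect size `delta`) are all choices made for the example.

```python
import math
import random


def simulate_intensities(n_genes, n_per_group, prop_null, delta, seed=0):
    """Simulate X_gij = mu_g + tau_gi + sigma_g * eps_gij for two groups.

    Returns (data, tau): data[g] is a pair (control, treated) of lists and
    tau[g] is the treatment effect of gene g (0 for unaffected genes).
    """
    rng = random.Random(seed)
    data, tau = [], []
    for _ in range(n_genes):
        mu_g = rng.gauss(8.0, 1.0)                           # assumed gene baseline
        sigma_g = math.sqrt(rng.gammavariate(1.5, 2.0) / 3)  # sigma_g^2 ~ chi2_3 / 3 (assumed)
        tau_g = 0.0 if rng.random() < prop_null else delta   # F_tau: mass at zero, else delta
        # tau_g1 = 0 in the control group and tau_g2 = tau_g in the treated group.
        control = [mu_g + sigma_g * rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [mu_g + tau_g + sigma_g * rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        data.append((control, treated))
        tau.append(tau_g)
    return data, tau


if __name__ == "__main__":
    data, tau = simulate_intensities(n_genes=1000, n_per_group=4, prop_null=0.8, delta=2.0)
    print(sum(1 for t in tau if t == 0.0) / len(tau))  # share of unaffected genes
```

With only four replicates per group, the empirical distributions of the per-gene means and standard deviations computed from such data are much more spread out than the distributions that generated them, which is the bias the three procedures try to correct.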

The three estimation steps are as follows:

1. Estimation of the error distribution $F_\epsilon$:

In (2.1), when the number of samples per group is very small (3, 4, 5), the
residuals are subject to two constraints (sample mean $\bar X = 0$, sample
standard deviation $s = 1$). If we then pool the residuals together, the
empirical distribution that is obtained gives a very poor estimator of the
error distribution $F_\epsilon$.

For example, suppose we sample 1000 genes from a normal distribution with two
groups of subjects of sizes 4 and 4. The empirical distribution of the
residuals is close to the true error distribution (which is standard normal),
as shown in the top-left graph of Figure 1. If we instead simulate from
t-distributions with 4 and 10 degrees of freedom, the qq-plot of the empirical
distribution is much worse, as also shown in Figure 1.

One simple way to avoid this problem is to select a subset of genes $S_G$ that
have small absolute t-values (say below 1, or some threshold that gives a large
set). For each gene in $S_G$, both samples are pooled together and normalized by
subtracting the gene mean and dividing by the standard deviation. If the sample
size per group is very small (3, 4, 5), it is much better to use the Huber
M-estimators of location and scale (Huber) instead of the sample mean and
standard deviation, as shown in Figure 1. This results in a table of residuals
$\epsilon_{gij}$, $g \in S_G$. The error distribution $F_\epsilon$ is estimated
by

(2.2) $\hat F_\epsilon = \mathrm{EmpiricalCDF}\{\epsilon_{gij} : g \in S_G,\ i = 1, 2,\ j = 1, \ldots, n_i\}$.

Figure 1 shows the qq-plot of the estimated error distribution when the errors
come from a t-distribution. The improvement is very clear.
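
The selection and pooling steps behind (2.2) can be sketched as follows. This is a simplified illustration and not the authors' implementation: for brevity the Huber M-estimators are replaced by the median and the scaled median absolute deviation (MAD), which is a different robust estimator, and the names `robust_standardize` and `estimate_error_cdf` as well as the threshold default are assumptions.

```python
import bisect
import math
import random
from statistics import mean, median, variance


def pooled_t(group1, group2):
    """Two-sample t-statistic with pooled variance."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group2) - mean(group1)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def robust_standardize(values):
    """Center by the median and scale by 1.4826 * MAD (stand-in for Huber)."""
    center = median(values)
    scale = 1.4826 * median([abs(v - center) for v in values])
    if scale == 0.0:
        return []  # degenerate gene, contributes no residuals
    return [(v - center) / scale for v in values]


def estimate_error_cdf(data, t_threshold=1.0):
    """Pool standardized residuals of genes with small |t| into an empirical CDF.

    data: list of (control, treated) pairs, one pair per gene.
    Returns (sorted residuals, cdf function).
    """
    residuals = []
    for control, treated in data:
        if abs(pooled_t(control, treated)) < t_threshold:
            residuals.extend(robust_standardize(list(control) + list(treated)))
    residuals.sort()

    def cdf(x):
        return bisect.bisect_right(residuals, x) / len(residuals)

    return residuals, cdf


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical null data: 1000 genes, 4 + 4 standard normal observations.
    genes = [([rng.gauss(0, 1) for _ in range(4)], [rng.gauss(0, 1) for _ in range(4)])
             for _ in range(1000)]
    residuals, cdf = estimate_error_cdf(genes)
    print(len(residuals), cdf(-1.0), cdf(1.0))
```

Restricting the pool to genes with small absolute t-values is what allows both groups to be standardized together, since for those genes the treatment effect is presumably negligible.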
2. Estimating $F_\sigma$:

We follow the method described in Amaratunga and Cabrera. They pointed out that
the empirical distribution, $\tilde F_\sigma$, of the $s_g$ is a very poor
estimator of the distribution $F_\sigma$, because on average $\tilde F_\sigma$
is much more scattered than $F_\sigma$. They proposed an estimate
$\hat F_\sigma$ of $F_\sigma$ that shrinks $\tilde F_\sigma$ towards its center,
hence producing a better estimator of $F_\sigma$. A similar algorithm is
discussed in item 3.

3. Estimating $F_\tau$ (determining the proportion of differentially expressed
genes):

We said earlier that $\tau_g$ is drawn from some distribution $F_\tau$. We
expect $F_\tau$ to have a mass at zero of probability $F_\tau(0) > 0$, which
represents the genes that are not differentially expressed. In order to
estimate the probability $P\{\tau_g = 0\}$ we apply an algorithm that produces
an estimator $\hat F_\tau$ such that
$E_{\hat F_\tau}(F_\tau^*(t)) = \tilde F_\tau(t)$, where $F_\tau^*(t)$ is the
random variable representing the empirical cdf of $\tau^{**}$ at value $t$,
which is constructed in the following algorithm, and $\tilde F_\tau(t)$
represents the actual observed value.

Fig 1. A comparison of the error distribution estimates obtained from the
empirical distribution (left) and our estimator (right), when the errors come
from Normal(0,1), $t_{10}$ and $t_4$ distributions. Panel labels: Q-Q plot
Normal distrib.; Q-Q plot t-distrib. df=4; (Before truncation), (After
truncation).

The algorithm is as follows:

Algorithm:

Step 1:

1.1) Draw a random sample, $s_g^*$, from $\hat F_\sigma$, which is our estimate
of the distribution of $\sigma$.

1.2) Estimate the error distribution $F_\epsilon$ with the empirical
distribution $\hat F_\epsilon$ defined in (2.2).

1.3) Take a random sample (with replacement): $r_{gij} \sim \hat F_\epsilon$
for $i = 1, 2$, $j = 1, \ldots, n_i$, $g = 1, \ldots, N$.

1.4) Draw a sample $\tau_g^*$ from $F_\tau(t) = I\{t > 0\}$, where
$I\{t > 0\} = 1$ if $t > 0$ and 0 otherwise.

1.5) Construct the pseudo-data: $X^*_{g1j} = s_g^* r_{g1j}$,
$X^*_{g2j} = \tau_g^* + s_g^* r_{g2j}$.

1.6) Reconstruct the distribution $F_\tau^* = E(F_\tau^* \mid F_\tau)$, where
$F_\tau^*$ is the distribution of $\tau^{**}$ given by the pseudo-data:
$\tau_g^{**} = |\bar X^*_{g2} - \bar X^*_{g1}|$.

1.7) Start by setting $F_\tau^{(old)} = F_\tau$.

1.8) Let $F_\tau^{(new)} = \tilde F_\tau(F_\tau^{*-1}(F_\tau^{(old)}))$.

1.9) Set $F_\tau^{(old)} = F_\tau^{(new)}$ and go to 1.3).

1.10) Iterate until convergence (approximately 100 iterations). At convergence
we get our final estimate $\hat F_\tau = F_\tau^{(new)}$.

1.11) Give a cutoff point, say $\eta$, which is a 95% quantile of the final
$\hat F_\tau(t)$.

Step 2:

2.1) Repeat 1.4)-1.8) using all original data $X_{gij}$ and the estimated
$\hat F_\tau$.

2.2) Get the estimated percentage of $\tau^{**}$ which is greater than
$\eta \times$ 95% quantile of standard normal.
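
One possible reading of the iteration in Step 1 is a quantile-mapping update, sketched below in Python. This is a simplified illustration under several assumptions and should not be taken as the authors' implementation: each distribution is represented by a sorted sample, the iteration starts from the observed absolute mean differences, $F_\tau^*$ is taken from a single Monte Carlo draw rather than an expectation, the per-gene scales and the residual pool are supplied by the caller, and the cutoff of step 1.11 and Step 2 are not implemented. The names `quantile` and `estimate_tau_distribution` are hypothetical.

```python
import bisect
import math
import random


def quantile(sorted_values, level):
    """Value of the empirical quantile function at `level` in [0, 1]."""
    n = len(sorted_values)
    index = min(n - 1, max(0, math.ceil(level * n) - 1))
    return sorted_values[index]


def estimate_tau_distribution(observed_tau, residuals, scales, n_per_group,
                              n_iter=30, seed=0):
    """Iterate F_new = F_obs(F_star^{-1}(F_old)), expressed on quantiles.

    observed_tau: observed |mean difference| for every gene.
    residuals:    pooled standardized residuals (estimate of the error law).
    scales:       sample of per-gene scales (estimate of the sigma law).
    Returns a sorted sample representing the estimated distribution of tau.
    """
    rng = random.Random(seed)
    n_genes = len(observed_tau)
    observed = sorted(observed_tau)
    old = list(observed)  # assumed starting value of the iteration
    for _ in range(n_iter):
        star = []
        for tau_star in old:
            s_g = rng.choice(scales)
            # Pseudo-data means: group 1 has no effect, group 2 is shifted by tau_star.
            r1 = sum(rng.choice(residuals) for _ in range(n_per_group)) / n_per_group
            r2 = sum(rng.choice(residuals) for _ in range(n_per_group)) / n_per_group
            star.append(abs(tau_star + s_g * (r2 - r1)))
        star.sort()
        new = []
        for t_obs in observed:
            level = bisect.bisect_right(star, t_obs) / n_genes  # F_star at the observed quantile
            new.append(quantile(old, level))                    # quantile of F_old at that level
        old = new
    return old


if __name__ == "__main__":
    rng = random.Random(1)
    n = 4
    # Hypothetical null data: no gene is differentially expressed.
    obs = []
    for _ in range(1000):
        g1 = [rng.gauss(0, 1) for _ in range(n)]
        g2 = [rng.gauss(0, 1) for _ in range(n)]
        obs.append(abs(sum(g2) / n - sum(g1) / n))
    pool = [rng.gauss(0, 1) for _ in range(2000)]
    est = estimate_tau_distribution(obs, pool, [1.0] * len(obs), n, n_iter=20)
    print(sum(obs) / len(obs), sum(est) / len(est))
```

In this reading, the update leaves the estimate unchanged exactly when the pseudo-data reproduce the observed distribution of absolute mean differences, which is the fixed-point property stated in Theorem 2.1. On null data the estimated effects shrink towards zero because the noise alone already accounts for the observed differences.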

Theorem 2.1. At convergence the estimator $\hat F_\tau$ is a fixed point of
step 1.8) of the algorithm. That is,
$\hat F_\tau = \tilde F_\tau(F_\tau^{*-1}(\hat F_\tau))$, and then we have

(2.3) $E_{\hat F_\tau}(F_\tau^*) = \tilde F_\tau$.

Proof. If the algorithm converges, then F T = P T (F*~ 1 (P T )). Thus 
P T o F- 1 o P T = F| t = E(F T \F T ) = P T 

=> F T O F- 1 = F T O F- 1 

(F T o F" 1 ) 2 = / 

PrOp- 1 =1 

or P T o P^ 1 = -I (impossible, since P T ,P T > 0) 

e Pt {P;) = e Pt (P t ) = F T = F T . □ 

Remark 1. Based on our simulations, the algorithm converges in at most 100 
iterations. 

Remark 2. At convergence, $\hat F_\tau$ is very close to $F_\tau$ and
$F_\tau^*$ is also very close to $\tilde F_\tau$, so that we have the nice
result $E_{\hat F_\tau}(F_\tau^*) = \tilde F_\tau$.

Remark 3. This is a two-stage estimation method. We split the data into two
pieces. One is the non-informative data, which produces a good estimate of the
error distribution. The other is the informative data, to which we apply a
shrinkage method to estimate the distribution of $\tau_g$, which gives a better
result.

Performance assessment: To assess the performance of this method, we simulated
data points that are normally and independently distributed.

1. $X_{gij} \sim N(\tau_g, 1)$, where $G = 10000$ and $n_1 = n_2 = 4$. We
assume that $G_{sig} = 1000, \ldots, 9000$ of the $G$ genes were differentially
expressed between the two groups and that their difference was $\delta$, i.e.,
$\tau_g = \delta$ ($\delta = 1, 2$) for all $g = 1, \ldots, G_{sig}$, and
$\tau_g = 0$ otherwise.

2. $X_{gij} \sim N(\tau_g, \sigma_g^2)$, where $G = 10000$ and
$n_1 = n_2 = 4$. We assume that $G_{sig} = 1000, \ldots, 9000$ of the $G$ genes
were differentially expressed between the two groups and that their difference
was $\delta = 1, 2$ for all $g = 1, \ldots, G_{sig}$, and $\tau_g = 0$
otherwise. The $\sigma_g^2$ are chi-square distributed with 3 degrees of
freedom, calibrated so that the mean of $\sigma_g^2$ is 1, i.e.,
$\sigma_g^2 \sim \chi^2_3/3$.
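
The first simulation design is easy to reproduce on a small scale. The Python sketch below is an illustration, not the authors' simulation code: it generates scenario 1 with a reduced number of genes and reports the share of genes whose two-sample t-test is significant at the 0.05 level, which is the quantity reported for the t-test method. The names `pooled_t` and `rejected_fraction` are assumptions; the critical value 2.447 is the two-sided 5% point of Student's t with 6 degrees of freedom and is only valid for 4 + 4 samples.

```python
import math
import random
from statistics import mean, variance

N_PER_GROUP = 4
T_CRIT = 2.447  # two-sided 5% critical value of Student's t with 4 + 4 - 2 = 6 d.f.


def pooled_t(group1, group2):
    """Two-sample t-statistic with pooled variance."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group2) - mean(group1)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def rejected_fraction(n_genes, n_sig, delta, seed=0):
    """Share of genes declared significant when the first n_sig genes are shifted by delta."""
    rng = random.Random(seed)
    rejected = 0
    for g in range(n_genes):
        shift = delta if g < n_sig else 0.0
        control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        treated = [rng.gauss(shift, 1.0) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(control, treated)) > T_CRIT:
            rejected += 1
    return rejected / n_genes


if __name__ == "__main__":
    # Reduced version of scenario 1: half of the genes differ by delta = 1.
    print(rejected_fraction(n_genes=4000, n_sig=2000, delta=1.0))
```

Because the test has low power with four replicates per group, the share of significant genes falls well short of the true proportion of differentially expressed genes, which is the pattern seen in the t-test rows of Table 1.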

We compare our method to permutation tests and t-tests, using a threshold of
0.05 to determine significance. These two methods are standard in biological
applications. Our method is much more accurate than the other two methods
(Tables 1-4, Figure 2). For example, with Normal(0,1) errors, $\delta = 2$ and
a true proportion of 0.5, the t-test and the permutation test give 0.354 and
0.259, whereas our method gives 0.522 (Table 1). Each cell in the tables is the
mean (standard deviation) over 10 simulations for each condition. In Figure 2,
the straight line represents the true values and the red line is obtained by a
smoothing spline. We also calculate the pFDR of our method for different values
of lambda (Tables 5-6). The pFDR decreases as the true value increases.
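
For reference, the positive false discovery rate reported in Tables 5-6 and the q-value mentioned in the Introduction have the following standard definitions due to Storey; they are restated here as background and are not derived in this paper. $V$ and $R$ denote the numbers of false and total rejections, $\gamma$ is a p-value threshold, and $\pi_0$ denotes the proportion of true null hypotheses, which is Storey's notation rather than the $\pi$ used above. The middle expression assumes independent p-values that are uniform under the null hypothesis.

```latex
\mathrm{pFDR} = E\!\left[\frac{V}{R} \,\middle|\, R > 0\right],
\qquad
\mathrm{pFDR}(\gamma) = \frac{\pi_0\,\gamma}{\Pr(P \le \gamma)},
\qquad
q(p) = \inf_{\gamma \ge p} \mathrm{pFDR}(\gamma).
```

The middle expression shows why an estimate of the null proportion enters directly into any q-value calculation.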

3. Discussion and extensions 

In this paper we propose an algorithm for estimating the proportion of
differentially expressed genes in a microarray experiment. We also show that
the estimator of the distribution of the variance converges to a fixed point.
We performed a simulation study to check the performance of our estimate; it is
shown to be satisfactory and to perform better than alternatives such as
permutation tests and the standard two-sample t-test. The simulations were
performed under normal and gamma error distributions, with constant variances
and with chi-square variances. In addition we illustrate the method with real
data examples on mice and mice2 (Table 7, Figure 3). In the real data examples
we obtained estimates of the proportion of significant genes that were more
realistic than those produced by the other methods. Hence, this algorithm gives
us more accurate predictions for detecting differentially expressed genes.

The same method is generally extendable to other, more complicated modeling
procedures such as the one-way ANOVA F-test and other linear models. The same
model is used, and these extensions will be developed in a second paper.
Another paper will deal with the GO issues, by modeling the p-values and
obtaining a null distribution that will be used to detect differentially
expressed gene networks and subsets.



Table 1
Normal(0,1)

                            true $\lambda$
$\delta$  Method            0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
1         t-test            0.066    0.085    0.103    0.119    0.136    0.154    0.171    0.186    0.207
                            (0.002)  (0.003)  (0.002)  (0.003)  (0.003)  (0.005)  (0.004)  (0.004)  (0.004)
1         Permutation test  0.039    0.051    0.063    0.073    0.085    0.096    0.107    0.116    0.130
                            (0.002)  (0.002)  (0.002)  (0.003)  (0.003)  (0.003)  (0.003)  (0.003)  (0.003)
1         New method        0.071    0.163    0.226    0.282    0.304    0.422    0.473    0.479    0.518
                            (0.058)  (0.091)  (0.084)  (0.072)  (0.049)  (0.081)  (0.105)  (0.145)  (0.120)
2         t-test            0.109    0.171    0.234    0.294    0.354    0.415    0.474    0.534    0.595
                            (0.002)  (0.002)  (0.003)  (0.003)  (0.003)  (0.004)  (0.005)  (0.004)  (0.004)
2         Permutation test  0.074    0.120    0.168    0.214    0.259    0.305    0.350    0.397    0.442
                            (0.003)  (0.002)  (0.002)  (0.003)  (0.004)  (0.003)  (0.005)  (0.005)  (0.004)
2         New method        0.087    0.196    0.321    0.431    0.522    0.635    0.720    0.823    0.923
                            (0.020)  (0.022)  (0.034)  (0.033)  (0.030)  (0.045)  (0.034)  (0.022)  (0.021)

Table 2
$N(0, \sigma^2)$, $\sigma^2 \sim \chi^2_3/3$

                            true $\lambda$
$\delta$  Method            0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
1         t-test            0.066    0.087    0.110    0.134    0.157    0.180    0.204    0.227    0.252
                            (0.001)  (0.004)  (0.002)  (0.004)  (0.003)  (0.004)  (0.003)  (0.004)  (0.003)
1         Permutation test  0.045    0.060    0.077    0.095    0.112    0.129    0.148    0.163    0.182
                            (0.002)  (0.002)  (0.002)  (0.003)  (0.002)  (0.003)  (0.004)  (0.003)  (0.002)
1         New method        0.079    0.145    0.153    0.301    0.327    0.436    0.513    0.576    0.577
                            (0.072)  (0.096)  (0.040)  (0.069)  (0.062)  (0.119)  (0.138)  (0.138)  (0.116)
2         t-test            0.105    0.172    0.237    0.303    0.370    0.435    0.498    0.565    0.630
                            (0.002)  (0.002)  (0.003)  (0.003)  (0.004)  (0.003)  (0.003)  (0.006)  (0.003)
2         Permutation test  0.080    0.134    0.186    0.241    0.295    0.347    0.400    0.451    0.508
                            (0.003)  (0.002)  (0.003)  (0.004)  (0.003)  (0.004)  (0.004)  (0.005)  (0.005)
2         New method        0.111    0.207    0.311    0.413    0.514    0.609    0.712    0.811    0.914
                            (0.027)  (0.034)  (0.032)  (0.030)  (0.025)  (0.022)  (0.017)  (0.018)  (0.015)



Table 3
Gamma(1, 1)

                            true $\lambda$
$\delta$  Method            0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
1         t-test            0.067    0.094    0.123    0.150    0.178    0.207    0.233    0.264    0.292
                            (0.002)  (0.004)  (0.003)  (0.004)  (0.004)  (0.004)  (0.004)  (0.004)  (0.003)
1         Permutation test  0.053    0.075    0.099    0.120    0.143    0.168    0.190    0.213    0.237
                            (0.001)  (0.002)  (0.003)  (0.003)  (0.004)  (0.003)  (0.003)  (0.006)  (0.003)
1         New method        0.059    0.151    0.225    0.310    0.321    0.377    0.482    0.504    0.626
                            (0.043)  (0.035)  (0.075)  (0.062)  (0.099)  (0.110)  (0.094)  (0.119)  (0.107)
2         t-test            0.108    0.177    0.246    0.313    0.381    0.450    0.521    0.588    0.657
                            (0.002)  (0.002)  (0.003)  (0.003)  (0.005)  (0.003)  (0.004)  (0.005)  (0.005)
2         Permutation test  0.090    0.151    0.212    0.272    0.330    0.391    0.454    0.514    0.576
                            (0.003)  (0.002)  (0.002)  (0.003)  (0.004)  (0.004)  (0.003)  (0.005)  (0.004)
2         New method        0.126    0.232    0.310    0.417    0.515    0.613    0.712    0.802    0.912
                            (0.048)  (0.045)  (0.024)  (0.020)  (0.023)  (0.010)  (0.015)  (0.014)  (0.013)



Table 4

                            true $\lambda$
$\delta$  Method            0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
1         t-test            0.065    0.086    0.109    0.130    0.153    0.174    0.197    0.218    0.243
                            (0.003)  (0.003)  (0.005)  (0.004)  (0.003)  (0.004)  (0.003)  (0.004)  (0.002)
1         Permutation test  0.043    0.058    0.075    0.090    0.106    0.122    0.138    0.153    0.170
                            (0.002)  (0.002)  (0.004)  (0.003)  (0.003)  (0.003)  (0.004)  (0.003)  (0.003)
1         New method        0.074    0.141    0.208    0.212    0.319    0.368    0.490    0.530    0.641
                            (0.060)  (0.100)  (0.065)  (0.074)  (0.080)  (0.091)  (0.133)  (0.128)  (0.084)
2         t-test            0.112    0.177    0.241    0.309    0.373    0.440    0.507    0.575    0.639
                            (0.002)  (0.002)  (0.002)  (0.005)  (0.003)  (0.003)  (0.004)  (0.005)  (0.005)
2         Permutation test  0.083    0.136    0.190    0.246    0.298    0.352    0.408    0.461    0.517
                            (0.002)  (0.003)  (0.002)  (0.004)  (0.003)  (0.004)  (0.004)  (0.006)  (0.006)
2         New method        0.113    0.205    0.309    0.411    0.516    0.610    0.718    0.811    0.918
                            (0.030)  (0.013)  (0.028)  (0.027)  (0.022)  (0.016)  (0.027)  (0.017)  (0.013)


Table 5

pFDR for our method with Normal(0,1) error distribution

true $\lambda$  0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9
$\delta = 1$    0.5471    0.3768    0.3823    0.2372    0.2372    0.1924    0.1486    0.0860    0.0482
                (0.1679)  (0.1371)  (0.0537)  (0.0535)  (0.0535)  (0.0354)  (0.0363)  (0.0209)  (0.0131)
$\delta = 2$    0.1963    0.1924    0.2416    0.1533    0.1215    0.0965    0.0841    0.0601    0.0465
                (0.0753)  (0.0741)  (0.0876)  (0.0393)  (0.0406)  (0.0242)  (0.0255)  (0.0093)  (0.0112)

Fig 2. Example comparing our method to the Permutation and t methods. The true errors are 

Fig 3. Density estimators for the p-values obtained from two toxicology datasets. 

Table 6

pFDR for our method with $N(0, \sigma^2)$, $\sigma^2 \sim \chi^2_3/3$ error distribution

true $\lambda$  0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9
$\delta = 1$    0.634     0.480     0.375     0.323     0.233     0.185     0.102     0.094     0.047
                (0.069)   (0.060)   (0.060)   (0.040)   (0.053)   (0.048)   (0.018)   (0.017)   (0.0135)
$\delta = 2$    0.325     0.226     0.167     0.139     0.119     0.107     0.074     0.063     0.037
                (0.099)   (0.054)   (0.042)   (0.022)   (0.020)   (0.016)   (0.017)   (0.014)   (0.0047)



Table 7

Results for the three methods applied to two real examples from toxicology

Estimated $\pi$     Mice     Mice2
t-test              0.245    0.499
Permutation test    0.220    0.443
New method          0.107    0.363